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Abstract 

Noncommutative harmonic analysis is used to solve a nonparametric 
estimation problem stated in terms of compound Poisson processes on 
compact Lie groups. This problem of decompounding is a generalization 
of a similar classical problem. The proposed solution is based on a char- 
acteristic function method. The treated problem is important to recent 
models of the physical inverse problem of multiple scattering. 

1 Introduction 

This paper studies the following nonparametric estimation problem. Let (X n ) n >\ 
be i.i.d. G- valued random variables for some group G, and let e denote the 
identity element of G. For example, G might be the group of 3 x 3 orthogonal 
matrices, in which case each X n would be a random 3x3 orthogonal matrix 
and e would be the 3x3 identity matrix. The process 

JV(t) 

Y(t) = J] X n , X = e, 

n=0 

where N = (N(t)) t >o is a Poisson process with parameter A > 0, is called 
a G- valued compound Poisson process. If G is not commutative, the above 
products are taken to be ordered from left to right, and Y(t) is called a left 
compound Poisson process. It is assumed that the random variables X n and 



N(t) are independent of each other, and for simplicity, it is further assumed 
that the Poisson parameter A is known. The general problem is to estimate 
the distribution of the X n given partial observations of one or more realisations 
of the compound Poisson process Y(t). Of specific interest, is the case when 
multiple realisations of Y(T) are available, for some fixed time instant T > 0. 

The real numbers form a group, with addition being the group operation. 
Choosing G to be this group results in the ordinary compound Poisson process 
y(t) = J2n=Q x n where xq — and x n for n > 1 are real- valued i.i.d. random 
variables. Estimating the distribution of the x n is known as decompounding 
and has been well-studied [H|2]. In the present paper, decompounding tech- 
niques are generalised to the case when G is a noncommutative group. This 
generalisation is non-trivial and requires ideas from noncommutative harmonic 
analysis. Although group-valued compound Poisson processes were introduced 
by Applebaum in [3], the corresponding decompounding problem has not been 
addressed in generality before. 

This paper contributes to the relatively recent trend consisting in the appli- 
cation of noncommutative harmonic analysis {i.e. harmonic analysis on groups) 
to estimation and inverse problems. It addresses a nonparametric estimation 
problem stated in terms of compound Poisson processes on compact Lie groups. 
We refer to this as the problem of decompounding on compact Lie groups, since it 
directly generalizes the classical problem of decompounding for scalar processes. 
This generalization is mathematically natural and is motivated by the physical 
inverse problem of multiple scattering. In particular, this paper also contributes 
to the modelling of multiple scattering using compound Poisson processes. 

Compound Poisson processes model the accumulation of rare events. As 
such, scalar compound Poisson processes are important tools in queuing and 
traffic problems and in risk theory. The classical problem of decompounding 
arises in the context of these processes. A functional approach to this problem 
is given by Buchman and Griibel [TJ . A characteristic function method is studied 
by Van Es et al. [2] . The applications of decompounding in queuing problems 
and risk theory are referenced in [T]. We generalize this problem by considering 
decompounding on compact Lie groups. We approach this new problem by using 
noncommutative harmonic analysis to generalize the above mentioned method 
of 0. 

The important potential which noncommutative harmonic analysis holds for 
engineering problems is well illustrated in the book of Chirikjian and Kyatkin [4]. 
Its importance to nonparametric estimation stems from the fact that it leads to 
the successful generalization of the highly important concept of characteristic 
function in probability. In mathematical research, this generalization was pio- 
neered by Grenander [5] and extensively developed by Heyer [6] . It has received 
special attention in the engineering community. See Yazici [T and the papers 
by Kim et al. |U H EB E] . 

The paper is organized as follows. Section [2] sets down the necessary back- 
ground in harmonic analysis and characteristic functions on compact Lie groups. 
Section [3] introduces compound Poisson processes on compact Lie groups. In 
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Section [4] we state the decompounding problem for these processes and present 
our approach based on noncommutative harmonic analysis. In Section [5] we 
propose a model for multiple scattering based on compound Poisson processes 
on the rotation group 50(3). Within this model, decompounding appears as a 
physical inverse problem. We apply our approach as described in Section [4] to 
this problem using numerical simulations. 

2 Characteristic functions on compact Lie groups 

Characteristic functions of scalar and vector- valued random variables are defined 
using the usual Fourier transform. Their extension to random variables with 
values on compact Lie groups owes to the tools of harmonic analysis on these 
groups. Our presentation of characteristic functions is adapted from 112] . 
Harmonic analysis on compact Lie groups is presented in more detail in recent 
papers [5J |7] . More thorough classical references thereon include [T3J [H] . 

Let G be a compact connected Lie group with identity e. We denote by \x the 
biinvariant normalized Haar measure on G. Hilbert spaces of square integrable 
(with respect to (i) complex and real- valued functions on G are noted L 2 (G, C) 
and L 2 (G, K). A representation of G is a continuous homomorphism it: G — * 
GL(V) with V a complex Hilbert space and GL(V) the group of invertible 
bounded linear maps of V. It is called irreducible if any G-invariant subspace 
of V is trivial i.e. equals {0} or V. Two representations 7T; : G — > GL(Vi) -with 
i — 1,2- are called equivalent if there exists an invertible bounded linear map 
L : V% — > V2 such that L o tti = TT2 ° L. Using this relation, the set of irreducible 
representations of G is partitioned into equivalence classes. 

The central result of harmonic analysis on compact groups is the Peter- 
Weyl theorem. For the current context, it can be stated as follows. Let Irr(G) 
be the set of equivalence classes of irreducible representations of G. Irr(G) 
is a countable set. If 6 £ Irr(G) then we have the two following facts. All 
representations of the class 5 have the same finite dimension ds . There exists 
in this class a unitary representation U A . Choosing one such representation 
we can suppose that II s : G — > SU(C ds ) with SU(C ds ) the group of special 
unitary dg x dg matrices. We distinguish the unit representation Sq G Irr(G) 
where U 5 °(g) — 1 for all g £ G. With this choice being fixed, we can state the 
Peter- Weyl theorem. 

Theorem 1 (Peter- Weyl) . The functions dg Uy taken for 6 £ Irr(G) and 
i,j = 1, . . . ,dg form an orthonormal basis of L 2 (G,C). 

Note that [//„■ is the usual notation for the matrix elements of U s . For all 
f £ L 2 (G,C) the theorem gives the Fourier pair 

As - J f{g)U\g)U^ 9 ) (1) 
f{g)= d s tr{A s U s {g)) (2) 

<5eIrr(G) 
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where t denotes the Hermitian conjugate and tr the trace. The Fourier series 
Q converges in L 2 (G,C). 

Consider the example G — S 1 . It is possible to make the identification 
(5 = 0,1,.... Then U s (z) = z s for z G 5 1 . Writting z = e ie for some G [0, 2n], 
this gives the classical Fourier expansion of periodic functions. 

We consider random objects and in particular G-valued random variables 
defined on a suitable probability space (fl,A,V). When referring to the prob- 
ability density of such a random variable X, we mean a probability density 
px G L 2 (G,R) with respect to fx. The characteristic function of a G-valued 
random variable is defined as follows. Compare to [5]. 

Definition 1. Let X be a G-valued random variable. The characteristic func- 
tion of X is the map 4>x given by 

5^4>x{5) = E(U 5 (X)) <5eIrr(G) 

Here E stands for expectation on the underlying probability space. For all 
5 G Irr(G), the expectation in the definition is finite since II s has unitary values. 
When X has a probability density px its characteristic function gives the Fourier 
coefficients of px as in . We have 

4> X {S) = E(U S (X)) = J p(g)U s (gW(g) S G Irr(G) 

The following proposition [T] reminds the relation between characteristic func- 
tions and the concepts of convolution and convergence in distribution. It is a 
generalization of classical properties for scalar random variables. Remember 
that a sequence (X n ) n >i of G-valued random variables is said to converge in 
distribution to a random variable X if for all real-valued continuous function / 
on G we have 

IimE(/(X n )) =E(f(X)) 

n 

The proof of proposition [l] is straightforward. See [5]- 
Proposition 1. The following two properties hold. 

1. Let X and Y be independent G-valued random variables and let Z = XY . 
We have for all S G Irr(G) 

<Pz{5) = <t> x {S)<hr{S) 

2. A sequence (X n ) n >i of G-valued random variables converges in distribu- 
tion to a random variable X iff for all 6 G Irr(G) 

lim^x„(<5) = </>x(8) 

71 

In order to solve our estimation problem in section [4] we will require random 
variables to have certain symmetry properties. We deal with these properties 
here. The following analysis draws on Liao [12j[T5]. 
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We will say that a G- valued random variable X is inverse invariant if X = 
X^ 1 . We will say that it is conjugate invariant if for all k G G we have that X = 
kXkT 1 . As usual = denotes equality in distribution. The following proposition]^ 
characterizes these two symmetry properties in terms of characteristic functions. 
It will be important to remember that for any two G-valued random variables 

X and Y we have X = Y iff (f>x = 4>y- This results from the completeness of 
the basis given by the II s as stated in the Peter- Weyl theorem [5]. 

Proposition 2. The following properties hold. 

1. X is inverse invariant iff for all 5 € Irr(G) we have that 4>x{8) is Hermi- 
tian. 

2. Let X be inverse invariant. If Xi,...,X n are independent copies of X 
then the product X% . . . X n is inverse invariant. 

3. X is conjugate invariant iff for all 5 G Irr(G) we have that (f>x(8) = a$Id s 
where ag G C and Id s is the ds x dg identity matrix. 

4- If X andY are independent and conjugate invariant then XY is conjugate 
invariant. 

5. X is conjugate invariant iff for all G-valued random variable Y indepen- 
dent of X we have XY = YX. 

Proof. 1. Note that for all 8 G Irr(G) we have by the homomorphism property 
of U s and the fact that it has unitary values 

4> x -i{8) = EiU^X- 1 )) = E(U S (X))1 = ^(5)t 

2. This follows from [T] of proposition [2] and [T] of proposition [T] since the 
powers of a Hermitian matrix are Hermitian. 

3. Note that for all k G G we have that X = kXk- 1 iff for all 5 G Irr(G) 

E(U S (X)) = EiU^kXk- 1 )) = U s {k)E{U s {X))U s {k)^ 

identifying <px on both sides, this becomes 

<j )x (5) = U s (k)0x(6)U s (k)l 

If this relation is verified for all k G G then <j>x (8) is a multiple of Id s ■ 
This follows by Schur's lemma |13j . 

4. This follows from [3] of proposition [2] and [T] of proposition [T] 

5. The if part follows by setting Y — k G G for arbitrary k. The only if part 
follows from [3] of proposition [2] and [l] of proposition [l] 

□ 
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[T]of proposition ^motivates a practical recipe for generating inverse invariant 
random variables from general random variables. Let X and Y be G-valucd 

random variables. Suppose X and Y are independent with Y = X~ l . It can be 
verified by[l]of proposition [2] that XY = YX and that both these products are 
inverse invariant. In practice, if we have generated X then we can immediately 
generate Y as above. In this way an inverse invariant XY or YX is generated 
from X. 



3 Compound Poisson Processes 

Compound Poisson processes on groups naturally generalize scalar compound 
Poisson processes. They are introduced by Applebaum in [5J. Let us start 
by reminding the definition of scalar compound Poisson processes. Let N — 
(N(t)) t >o be a Poisson process with parameter A > 0. Suppose {x n )n>i are i.i.d. 
K-valued random variables. Suppose the family (x n ) n >i is itself independent of 
N. The following process y is said to be a compound Poisson process 

N(t) 

y(t) = X! %n 

n=Q 

G-valued compound Poisson processes are defined by analogy to this formula. 
We continue with the process N. Let (X n ) n >i be i.i.d. G-valued random 
variables and suppose as before that the family (X n ) n >i is independent of N. 
The following process Y is said to be a G-valued left compound Poisson process 

N(t) 

Y(t) =Y[X n 

n=0 

We understand that products are ordered from left to right. It is possible to 
obtain a right compound Poisson process by considering Yit) -1 instead. Thus 
the two concepts are equivalent. See [T21 15]. 

Before going on, we make the following remark on the above definition of 
compound Poisson processes. This definition was stated for G a compact con- 
nected Lie group. This topological and manifold structure of G is not necessary 
for the definition, which can be stated in its above form for any group with 
a measurable space structure. The compact connected group structure of G 
allows us to use the Peter- Weyl theorem and characteristic functions. The Lie 
group structure allows the introduction of Brownian noise in Section [4] 

We wish to summarize the symmetry properties of the random variables Y(t) 
for t > 0. Note first that for all t > 0, Y(t) does not have a probability density. 
Indeed, for alH > we have V(Y(t) = e) > F(N(t) = 0) = e~ A *. It follows that 
Y(t) has an atom at e. In the absence of a probability density, we study Y(t) for 
t > using its characteristic function. This is given in the following Proposition 
[3] which can be seen to immediately generalize the well known formula for scalar 
compound Poisson processes. This proposition follows [12] 13], 
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Proposition 3. For all t > the characteristic function <j>Y{t) °f Y(t) is given 
by 

c/) Y(t) (S)=ex P (Xt(4, x (5)-I ds )) (3) 
for 6 € lrr(G), where 4>x = 4>x x - 

Proof. Let t > 0. 4>Y(t) can be calculated by conditioning over the values of 
N{t). Using the independence of N and (X n ) n >i we have for 6 € lrr(G) 



(At)' 

n>0 m—0 

Using the fact that (X n ) n >i are z.z.d. it is possible to replace 

n n 

E [] U 5 (X m ) = J] E(C/ 5 (X m )) = X (<5)" 

m—0 m—0 

the proposition follows by rearranging the sum. □ 

Combining Propositions [3] and [2] we have the following proposition. It states 
that for all t > the symmetry properties of Y(t) are the same as those of the 

Proposition 4. For all t > we have 

1. If X\ is inverse invariant then so is Y(t). 

2. If X\ is conjugate invariant then so is Y(t) . 

We end this section with Proposition [5] It gives a property of uniformization 
of the distribution of Y(t) as t t oo. This is similar to the behavior of the 
products X\ . . . X n for n t oo, see [5]. For a more general version of Proposition 
[5] see [TU [TS] . We say that a G- valued random variable X is supported by a 
measurable subset S of G if W(X £ S) = 1. If X and X' are G- valued random 

variables with X = X' then X is supported by S iff X' is supported by S. 
In Proposition [5] U is a G-valued random variable with probability density 
identically equal to 1. That is, U is uniformly distributed on G. 

Proposition 5. If X\ is not supported by any closed proper subgroup S of G 
or coset gS , g € G of such a subgroup then Y(t) converges in distribution to U 
as t | 00. 

Proof. Under the conditions of the proposition we have for all for all S =/= S 
that the eigenvalues of <t>x{$) are all < 1 in modulus [5]. It follows that the 
eigenvalues of 4>x{8) — Id s all have negative real parts. Thus when 5 ^ S 
we have by ((3|) that </>Y(t)(8) -> as f f 00. Moreover, it is immediate that 
4>Y(t){$o) = 1 for t > 0. We conclude using [2] of Proposition [TJ Note that [13] 

<h{8) = J U s (gW(g) = 5^ S 
and 4>u(5o) = 1 trivially. □ 
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4 Decompounding 



In existing literature, decompounding refers to a set of nonparametric estimation 
problems involving scalar compound Poisson processes [I] [2] . In this section we 
will consider the generalization of these problems to compound Poisson pro- 
cesses on compact Lie groups. The new problems can be stated in the notation 
of Section [3] We refer to them also as decompounding problems. As in the 
scalar case, they consist in estimation of the common probability density (sup- 
posed to exist) of the random variables X n from observations of the process 
Y. The unknown common probability density of the X n will be noted p. We 
are unaware of any work on similar problems for vector-valued compound Pois- 
son processes. Our consideration of compact Lie groups is motivated by the 
applications presented in Section [5] 

4.1 Typology of decompounding problems 

Several decompounding problems can be stated, depending on the nature of the 
observations made of Y [2] ■ Decompounding is performed from high frequency 
observations if an individual trajectory of the process Y is observed over time 
intervals [0,2*] where T t oo. It is performed from low frequency observations if 
i.i.d. observations are made of the random variable Y(T) for a fixed T > 0. 

Decompounding from high and low frequency observations lead to different 
difficulties. For high frequency observations, the problem is greatly simplified if 
the assumption is made that X n does not take the value e, for any n > 1. With 
probability 1, a trajectory of N has infinitely many jumps over t > 0. Under 
the assumption we have made, all these jumps correspond to jumps of Y which 
we do observe. The jumps of Y then give i.i.d. observations of X\ and the 
average time between these jumps is 1/A. In particular, it is important for high 
frequency observations to take the limit T ] oo. 

Low frequency observations do not give direct access to A. In scalar decom- 
pounding from low frequency observations, A is often assumed to be known 012]. 
In the context of a compact group G, Proposition [5] leads to a difficulty that 
does not appear in scalar decompounding. Under the conditions of this propo- 
sition, if low frequency observations are made at a sufficiently large time T then 
these observations will be uniformly distributed on G and will have no memory 
of the random variables X n . 

A third intermediate type of observations is possible. It is possible to make 
observations of an individual trajectory of Y at regular time intervals T, 2T, .... 
This is in fact equivalent to low frequency distributions. Remember that N is 
a Levy process, i.e. has independent stationary increments. Moreover we have 
that the (X n ) n >\ are i.i.d. Using this, it is possible to prove that the G-valued 
random variables 

Y(T),Y(Ty 1 Y(2T),Y(2T)- 1 Y(3T) . . . 

are i.i.d. Thus our observations are i.i.d. observations of Y(T). This remark 
refers to the fact that Y is a left Levy process in G [12] • We do not develop this 
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here. 



4.2 Noise model for low frequency observations 

We will consider decompounding from low frequency observations. T > is 
fixed and i.i.d. observations (Z n ) n >i of a noisy version Z oiY(T) are available. 
Z is given by Y corrupted by multiplicative noise. We have the noise model 

Z = MY(T) (4) 

where M is independent of Y. By[l]of Proposition [l] we have for the character- 
istic function of Z 

4>Z = <f>M(f>Y(T) 

The noise model is equivalent to having an initial value Y(0) = M with a 
general distribution. We consider the case of Brownian noise. The characteristic 
function of M is then given by [12l |8] 



4> M (S) = exp ( -X s — 



where a 2 is a variance parameter and for 5 S Irr(G) the constant As is the 
corresponding eigenvalue of the Laplace-Beltrami operator. In particular, Xs — 
and Aj > for 5 ^ 5 . It is clear from [3j of Proposition [2] that M is conjugate 
invariant. It follows by [4] of Proposition [2] that, as far as the distribution of Z is 
concerned, left and right multiplication of Y(T) by the noise M are indifferent. 

It is possible to construct a G- valued process £ such that Z = ((T). The 
corresponding construction is well known in the theory of group-valued Levy 
processes and is referred to as interlacing [31 [T5] . Here we only state this con- 
struction. Let W be a Brownian motion on G independent of N and with 
variance parameter a 2 . This is a process with continuous paths and indepen- 
dent stationary increments. Moreover, W(0) = e and for 8 £ Irr(G) 



<f>w(t){5) = ex P (~^ s ~2*) Ids 



Let Tq = and suppose (T n ) n >i are the jump times of N. The interlaced 
process C is defined as follows. We have £(0) = e. For t > and n > 1 we have 

C(t) = C(T n -. 1 )W(T r ^ 1 )- 1 W(t) on {T„_! < t < T n } 

where the following formula holds at each time T n (here C(T„— ) denotes the left 
limit at T n ) 

C(T„) = ((T n -)X n 

This definition is sufficient, since T n 1 oo almost surely. The term interlacing 
comes from the fact that the trajectories of C are obtained by introducing the 
jumps of Y into the trajectories of W as these jumps occur. The trajectories of 
W are thus interlaced with the jumps of Y. 
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For t > the characteristic function of C,(t) is given by 

c/> m (6) = exp (t\<p x (5) - tl ds (x+^f^j (5) 

for S e Irr(G). It follows that we have Z = C(T) if To 2 = a 2 . 

Although we do not deal with the case of high frequency observations we 
would like to end this subsection with a remark on the role of noise in this case. 
The trajectories of the interlaced process £ are noisy versions of the trajectories 
of Y. However, these trajectories have the same jumps as the trajectories of Y. 
In this sense, high frequency observations are unaltered by noise. 

4.3 A characteristic function method 

We present a characteristic function method for decompounding from low fre- 
quency observations. This method extends a similar one considered in [2]. In 
carrying out this extension, we are guided by the properties of characteristic 
functions on G presented in Section [2j Our observations (Z n ) n >x and noise 
model Q were described in 4.2 We aim to estimate the common density p of 



the X n . A characteristic function method consists in constructing nonparamet- 
ric estimates for p from parametric estimates for its Fourier coefficients 4>x{5) 
given for 5 G Irr(G). See [5]. 

We suppose that A and a 2 are known. Equation ^ can be copied as follows 

<t>z(S)=exp(T)4x(8)-TXI dt ) 5 e Irr(G) (6) 

where A is a constant determined by A and a 2 . We refer to this transformation 
<px l— * <i>z as the compounding transformation. Decompounding will involve 
local inversion of the compounding transformation. This is clearly related to 
inversion of the matrix exponential in a neighborhood of <pz (S) for all S G Irr(G) . 
Rather than deal with this problem in general, we make the following simplifying 
hypothesis. 

Hypothesis: X\ is inverse invariant. 

For all 5 e Irr(G) we have by applying [T] of Proposition [2] and (|6| to this hypoth- 
esis that 4>z{$) is Hermitian positive definite. Note Log the unique Hermitian 
matrix logarithm of a hermitian positive definite matrix. We can now express 
the inverse of the compounding transformation. From equation ([6]) it follows 
that 

<M<5) = ^Log[<M£)]+(A/A)l d4 <5elrr(G) (7) 

Let 5 € Irr(G). It follows from definition [l] that empirical estimates of 4>z{8) 
based on the observations (Z n ) n >\ are unbiased and consistent. This is a simple 
consequence of the strong law of large numbers. See for example |16j . In order 
to estimate <j>x{$) using ([7]) it is then important to ensure that the empirical 
estimates of 4>z{&) are asymptotically Hermitian positive definite. 
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We start by defining the empirical estimates 4^(6) for S £ Irr(G) and n > 1 

^) = ^E (u s (z m ) + u s (z m y) 

m—l 

Hermitian symmetrization of empirical estimates is necessary for the applica- 
tion of ([7]). Since it is a projection operator, this symmetrization moreover 
contributes to a faster convergence of the 4>z(S) to 4>z{$)- 

Continuous dependence of the spectrum of a matrix on its coefficients is a 
classical result in matrix analysis. Several more or less sophisticated versions of 
this result exist [17] . For a remarkably straightforward statement see [18]. For 
a complex matrix C we will note A(G) its spectrum. For each 5 £ Irr(G) and 
n > 1 define the event Kg by 

fl? = {A(&(a))c]0,oo[} 

For 5 £ Irr(G), the sequence (Rg) n >i controls the convergence of the spectra of 
the empirical estimates tfig^S). In particular, 

P(u„> n m >„ Rf) = iimP(n m >„iC) = 1 

n ~ 

Using the events Rg we can write down well defined estimates of 4>x ■ These are 
noted for 5 £ Irr(G) and n > 1 

4> x (6) = on n-Kg 1 

r x (6) = ^Log[^(«5)] +(\/\)l ds on By 

This expression gives our parametric estimates for the Fourier coefficients of p. 
We use them to construct nonparametric estimates based on an expression of 
the form pi. Let (Ti)i>x be an increasing sequence of finite subsets Ti C Irr(G) 
with the limit Uj^iT; — Irr(G) — {So}. Let K > and for each 6 £ Irr(G) note 

fs = d s e- KXs 

For n > 1 and / > 1 our nonparametric estimate pf is given by 

P?(g) = i + J2fs^(^xm s (9) 1 ) geG (8) 

The subscript / > 1 corresponds to a cutoff or smoothing parameter. Indeed, 
infinitely many representations are excluded from the sum over T;. A more 
complete expression of this fact appears in [8]. When K > the coefficients 
fs form a convolution mask ensuring that the estimates p™ can be taken to 
converge to a smooth probability density. We make this more precise in |4.4| 

It is usual to rewrite expressions similar to ^ in terms of a group invariant 
kernel. See [5] IS] • Such a transformation is not possible here due to the indirect 
nature of our observations. This is in particular related to the more involved 
form of the 4>x($) as given above. 
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4.4 Convergence of parametric and nonparametric esti- 
mates 



Here we discuss the convergence of the parametric and nonparametric estimates 
given in |4.3| Our argument is presented in the form of Propositions [6] and [7] 
below. Proposition y gives the consistency of the parametric estimates 4>^(S). 
Proposition [7] states a subsequent result for the nonparametric estimates pf. 
For Proposition [6] we will need inequalities ([£]) and (10 1. These express 



stability results for the eigenvalues of Hcrmitian matrices and for the Hermitian 
matrix function Log. Let A and B be Hermitian d x d matrices, for some d > 1. 
For 1 < i < d let a* and /3j be the eigenvalues of A and B respectively. Suppose 
they are arranged in nondecreasing order. We have 

E(A-^) 2 <|S-^| 2 (9) 

i=i 

where | . | is the Euclidean matrix norm. This inequality is known as the Wielandt- 
Hoffman theorem. In [T7], it is stated for A and B real symmetric. The general 
case of Hermitian A and B can be obtained from this statement using a canon- 
ical realification isomorphism. 

Suppose A and B are positive definite. For our purpose it is suitable to 
assume both X(A) and X(B) are contained in an interval [k, 1] for some k > 0. 
Under this assumption we have the following Lipschitz property 

|Log(B) -Log(A)| < Vdk- 2 \B~A\ (10) 



In order to obtain (10 1 it is possible to start by expressing Log(A) as follows 



LogCA)= f (A - Id)[t{A - I d ) + Id]~ 1 dt 
Jo 

This expression results from a similar one for the real logarithm applied to each 



eigenvalue of A. Subtracting the same expression for Log(B), (10 1 follows by 
simple calculations. 

Proposition 6. For all S £ Irr(G) we have the limit in probability lim„ = 
4>x(6). 

Proof. We only need to consider S ^ 5q. Indeed, 4>x($o) = 0x(<5o) = 1 for all 
n > 1. Let S ^ Sq, for all n > 1 we have 

1 " 

\r z (s)\o P < — J2 \u s (z rn )\ op + \u s (z m y\ op = i 

m—l 

where |.| op is the operator matrix norm. Passing to the limit, we have the same 
inequality for (j>z(S). It follows that all eigenvalues of <j>^(6) or <pz{5) are < 1. 
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Since 4>z{8) is positive definite, there exists kg > such that \{<j)z{5)) C [kg, 1]. 



For n > 1, note Rg the event 



^ = {A(^W)c[fe/2,l]} 
From inequality ^ we have 

P(fl - < P(|<&(5) - <M<5)I > *«/2) 



Since i?^ C i?^, it follows from inequality (10 1 that 

P(|&0) " <M*)I > e n j$) < P(|<^(<5) - Z (5)| > kje/L) 

for all e > 0, where L = A^/d^/TX. 

The proof can be completed by a usual application of Chcbychev's inequality, 

m n x(S) - <Px(S)\ >e)< ( 8 + 2 f /g2 ) (^f) 2 (11) 

for all e > 0. □ 

Proposition [7] relics on Proposition [6] and the Peter- Weyl theorem. It im- 
plies the existence of sequences [pk)k>\i °f nonparametric estimates given by 
(fsj) , converging to p in probability in L 2 (G, C) with any prescribed rate of con- 
vergence. Convergence in probability in L 2 (G, C) means that the following limit 
in probability holds 

lim||pfc -p\\ = 

k 

where |.| is the L 2 (G, C) norm. It is clear from (JsT) that for all k > 1 we have 
pk G L 2 (G, C). In order to obtain nonparametric estimators in L 2 (G,IR) and 
converging to p in the same sense, it is enough to consider the real parts of the 
pk- The following proof of Proposition [7J implicitly uses Plancherel's formula as 
in®. 

Proposition 7. Putting K — in |#|) ; we have the limit in probability 

limlim \\pf — p\\ = 

l n 

Proof. For I > 1 let p l S £ 2 (G, C) be given by 

Pl(ff) = l+X>(^c(*)tfW) 
<5er, 

for g S G. By the Peter- Weyl theorem, lim; p|| = 0. By ([8]) and Proposition 
[6] we have lim n ||p™ — pjl = in probability for all / > 1. The proposition follows 
by observing that 

m-pw^u-pif + wpt-pw 2 (12) 

for all > 1. □ 
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Proposition [6] obtained convergence in probability of the parametric esti- 
mates </>x(^) f° r a ^ ^ ^ Irr(G). These parametric estimates depend only on the 
observations. In particular, they can be evaluated without any a priori knowl- 
edge of p. By introducing such knowledge, it is possible to define parametric 
estimates <f>\(8) converging in the square mean to the same limits 4>x(8)- For 
5 e Irr(G) and n > 1 the 4>\{5) are given by 



few = 



o 

i 



Log 



r z (s) +(\/\)i d 



where the events are as in the proof of Proposition [fj] and we assume known 
a priori constants kg necessary for their definition. As in ([8]), we can define 
nonparametric estimates pf where for n, I > 1 

pf( 9 ) = i + J2 fs ^ (4> 7 x(6)u 5 (g)^ 9 e G 

<5er. 

For all 6 € Irr(G) and n > 1 we have 



E|0'i(<5)-^(5)| 2 



< 



L 1 fd s 



(13) 



where L' is a constant depending on the product TA. This follows by a reasoning 
similar to the proof of Proposition [6j Moreover, for all n,l > 1 we have after 
putting if = 



p\\ 2 < 



:(d$/ki) + \\ Pl -p\\ 



(14) 



seri 



for the functions p l defined in the proof of Proposition [7] This follows from 
Plancherel's formula in (12). 



We have characterized the convergence of parametric estimates using (111 



and (13 1 and the convergence of nonparametric estimates using (12) and (14 1 



We make the following remarks on these formulae. Inequalities (11) and (13 1 



only give gross bounds for the rate of convergence of parametric estimates. The 
quality of these bounds improves when the constants k$ are greater, i.e. closer 
to the value 1. This is equivalent to the L 2 (G,M.) distance between p and the 
uniform density being greater. This last point can be appreciated in relation to 
the example of figure |5.3| in |5.3| 



(12 1 and (14) describe the convergence of nonparametric estimates in a way 
similar to the one used in [S]. Indeed, the nonparametric estimation error is 
decomposed into two terms. One is given by the parametric estimation error 
and the other depends only on p. This second term is given by the convergence 
of the Fourier series of p. This is determined by the smoothness properties of 
p. We note the two following differences with [5], both related to the indirect 
nature of our observations 
be identified as the 



First, the first and second terms in (|14|) can not 
variance" and "bias" of pf 



Second, (14) characterizes 
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the nonparametric estimation error as depending on the whole spectrum of p 
-through the constants k$- rather than just its smoothness properties. 

We finally return to the role of the parameter K introduced in Q. For 
simplicity, we have put K — for Proposition [7] and inequality (14). Let K > 0. 



The following function p^ £ L 2 (G,M.) is an infinitely differentiable probability 
density [HJ |S] 

PK(g) = l+ fstr(A s U s ( 9 y) (15) 

Using the same K in Q and proceeding as for proposition [jj it is possible to 
obtain the limit in probability 

limlim||p7 - pk\\ = 
l n 

A similar limit also holds for the p™. Note that in addition to being smooth, 
Pk can be chosen arbitrarily close to p in L 2 (G,M.) for K > small enough. 



5 Decompounding on 50(3) and multiple scat- 
tering 

This section fulfills two goals. First, it summarizes recent use of compound 
Poisson processes on the rotation group SO(3) in the modelling of multiple 
scattering and introduces decompounding on 50(3) as a physical inverse prob- 
lem. Second, it illustrates the characteristic function method presented in |4.3| by 
applying it to a numerical example of decompounding on SO(3). nonparametric 
estimation on the rotation group SO(3) has received special attention [TTJ|9]. It 
is important to many concrete applications and constitutes a privileged starting 
point for generalization to compact groups. 



5.1 The compound Poisson model for multiple scattering 

Many experimental and applied settings aim to infer the properties of com- 
plex, e.g. geophysical or biological, media by considering multiple scattering of 
mechanical or electromagnetic waves by these media. Inference problems aris- 
ing in this way are formulated as physical inverse problems within the frame- 
work of various approximations of the exact equations of radiative transfer. 
See HHH3CII]. 

A compound Poisson model for the direct problem of multiple scattering 
was considered by Ning et al. [32]. It is based on a K- valued compound Poisson 
process. Consideration of compound Poisson processes on SO (3) leads to a 
model of multiple scattering which is sufficiently precise as well as amenable 
to statistical treatment. This model extends the validity of the small angles 
approximation of radiative transfer. It also allows the formulation of the physical 
inverse problem of multiple scattering as a statistical nonparametric estimation 
problem. 
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We give an example expanding the above discussion. The development of 
Section[3]is converted into the terminology of radiative transfer, see [23]. Certain 
usual results in harmonic analysis on SO (3) are here referred to freely. They 
are set down in a precise form in |5.2| 

A scalar plane wave is perpendicularly incident upon a plane parallel mul- 
tiple scattering layer of thickness H. The velocity of the wave in the layer is 
normalized so that we have r = I for the mean free time r and mean free path 
£. Coordinates and time origin are chosen so that the wave enters the layer 
at time with direction of propagation s(0) = (0,0,1). After time t in the 
layer this direction of propagation becomes s(t) — (s 1 (t) , s 2 (t) , s 3 (i)) . This is 
considered to be a random variable with values on the unit sphere S 2 C R 3 . 
The distribution of the random variable s(H) is noted Ijj- It is identified with 
the normalized angular pattern of intensity transmitted by the layer. We return 
below to the validity of this identification. 

The interaction of the wave with the layer takes place in the form of a 
succession of scattering events. These are understood as interaction of the 
wave with individual scatterers present at random emplacements throughout 
the layer. The random number of scattering events up to time < t < H 
will be noted N(t). Suppose the n th scattering event takes place at the time 
< T n < H . This affects the direction of propagation as follows 

s(T n ) = s(T n -)X n (16) 

Here X n is a random variable with values in SO (3). It is identified with a 



random orthogonal matrix. Formula ( 16 1 is understood as a matrix equality 



where s(T n ) and s(T n — ) are line vectors. From ( |16| ) and the definition of N(t) 
we can write for < t < H 

(N(t) \ 

S(t) = 8(0) J] Xn] (17) 



A certain number of standard physical hypotheses can be replaced in (17 1. 
This will allow for the random product therein to be exhibited as a conjugate 
invariant compound Poisson process on SO (3). 

Under the condition i <C H it is possible to make the hypothesis that the 
time between successive scattering events has an exponential distribution |21j . 
This allows us to model N(t) as a Poisson process with parameter 1 /£. More- 
over, we suppose the scatterers identical and scattering events independent. 
This amounts to taking the S l O(3)-valued random variables X n to be i.i.d.. If 
the additional assumption is accepted that the number of scattering events is 



independent of the whole outcome of these events then formula (17 1 can be 
rewritten < t < H 

s(t) = s(0)Y(t) (18) 

Where Y is a (left) compound Poisson process on SO(3) with parameter 

It is usual to assume that the random variables X n have a common probability 
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density p. For homogeneity with [4] we mention that p is a square integrable 
probability density with respect to the Haar measure of 50(3). In the theory 
of radiative transfer, p is known as the phase function of the layer [23] . 



In order to simplify the Fourier series of p to a Legendre series (22 1 we 



profit from the physical hypothesis of statistical isotropy. This implies that 



scattering events in the layer as given by ( 16 ) are symmetric around the direction 
of propagation s(T„— ). Statistical isotropy is a valid assumption in a plurality 
of concrete situations. It is verified by analytical models such as Gaussian and 
Henyey-Greenstein phase functions, commonly used to describe scattering in 
geophysical and biological media [2~i] . 

Under the hypothesis of statistical isotropy the phase function p is a zonal 



function in the sense precised in 5.2 It admits a Legendre series (22 1 wherein 
the coefficients a$ for 5 € N are said to form the associated power spectrum of 
heterogenities |23] . If p is the Henyey-Greenstein phase function then the power 
spectrum of heterogenities is given by as — g s for S £ N and p can be expressed 
in the closed form [2H US] 

P(cos0)= ]~f (19) 

(1 + <T — 2g cos 0)2 

In this formula the variable E [0,ir] refers to the scattering angle from an 



individual scatterer. It is given a mathematical definition in formula (22 1 of 



5.2 The parameter g G [0, 1[ is called the anisotropy or asymmetry parameter. 
It can be shown to give the average cosine of the scattering angle 9. For the 
scattering of light waves by water clouds and blood we have respectively g = 0.85 
and g = 0.95, see [35]. 

Proposition [3] of Section [3] can be used to give the angular pattern of trans- 
mitted intensity 1^ in terms of the power spectrum of heterogenities. This is 



expressed in the following equation ( 20 1 . This relates the directly observable 



outcome of multiple scattering in the layer to the constitutive microscopic prop- 
erties of the layer, typically quite difficult to ascertain directly. Replacing in 
Proposition [3] the definition of the process Y of ( 18 1 and using the Legendre 
series (|22| of p we have 



s>o 



- , (26 + ljeTl"^ 1 ' / Pa (cos sin ^ (20) 
2tt f— ! Jo 



For the ratio Ih(@) of intensity transmitted within a pencil of angle 20 around 
S(0) - 

Equation ( |20[ > is well known in the small angles approximation of radiative 
transfer where it is derived under the assumption of strong forward scatter- 
ing |23j . Mathematically, this translates into a phase function p with a sharp 



peak around = 0. Our probabilistic development of equation (20 1 does not 
explicitly make this assumption. However, the identification of Ih with the 
angular pattern of transmitted intensity implicitly requires for all the intensity 
of the wave entering the layer to be transmitted. This precludes an important 
deviation between s(0) and s(H). 
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Equation (20) is an interesting starting point for the formulation of the 
physical inverse problem of multiple scattering. Supposing a situation where 
this equation holds, being able to invert it implies access to the power spec- 
trum of heterogenities or alternatively the phase function from direct intensity 
measurements. This implies inference of physical parameters such as the param- 
eter g of the Henyey-Greenstein phase function or determination of microscopic 
properties such as the shape of individual scatterers [55] , 

Our use of compound Poisson processes on 50(3) to model multiple scat- 



tering lead to the probabilistic counterpart (18 1 of equation (20 1. In relation to 



( 18 ), the physical inverse problem inherent to equation ( |20[ ) is reformulated as a 
statistical estimation problem. This appears as the problem of decompounding 
on 50(3) or some related parametric estimation problem. A crucial difference 
between the two approaches is that they proceed from different types of data. 

Suppose the distribution of s(0) is known and symmetric around (0,0,1) 
-this is the case in many experimental settings. Instead of carrying out mea- 
surements of transmitted intensity, it is possible to make observations of s(H). 
Under the hypothesis of statistical isotropy these observations of s(H) are equiv- 
alent to observations of Y(H). If our objective is to estimate the phase function 
p then we have to deal with decompounding on 50(3) from low frequency ob- 
servations of Y. In many cases, we could be interested in the power spectrum of 
heterogenities or some related physical parameters. We then have to deal with 
a parametric estimation problem. 

5.2 Harmonic analysis on SO (3) 

We here make a short digression on harmonic analysis on 50(3) in order to 



clarify the references made to this subject in 5.1 and to prepare for 5.3 50(3) 
is often used as the archetype compact connected Lie group. Essentially, we will 
specify the Peter- Weyl theorem as stated in Section [2] to the case G = 50(3). 
For the following see [5] or the more detailed account in [3]. 

We use the notation of Section [2] In particular, fi denotes the Haar measure 
of 50(3). It is possible to identify Irr(50(3)) = N so that d s = 26+1 for each 5 € 
Irr(50(3)). With this identification, the most current choice of functions U s : 
50(3) — > SU(dg) can be given in analytical form using the parameterization of 
50(3) by Euler angles. 

The ZYZ Euler angles ip,ip € [0,27r] and 8 £ [0,7r] are well defined coordi- 
nates only on a subset of 50(3). This is however a dense subset in the Euclidean 
topology of 50(3) and has Haar measure equal to 1. Let p : 50(3) — > C. If p is 
continuous or p E L 2 (SO(3), C) it follows that p can be identified with a func- 
tion of the Euler angles p = p(ip,6,ip). The chosen functions II s are extended 
by continuity from the following expression for their matrix elements 

U 5 ab ( V , 8, VO = e- ia ^ b (cos 8)e-^ (21) 

for 5 e Irr(50(3)) and —S < a,b < 6. The notation d s ab is used for the real- 
valued Wigner d-functions, which can be given in terms of the Jacobi polyno- 
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mials. For S £ Irr(50(3)) we have d,Q = P$ the Legendre polynomial of order 
5. 

The Haar measure /i is expressed in the coordinates (ip, 9, ip) as follows 

dfj,(<f, 9, ip) = sin 9dipd9dip 

Suppose a function p <= L 2 (50(3), C) is expressed in the form p(ip, 9, ip). In order 
to obtain its Fourier coefficients, it is enough to replace the above expressions 
for the functions II s and \i in formula |lj . This formula then reduces to a triple 
integral. By the Peter- Weyl theorem, the Fourier coefficients of p give rise to a 
Fourier series approximating p in L 2 (5*0(3), C). 

The class of zonal functions on 5*0(3) arises in relation to the hypothe- 
sis of statistical isotropy mentioned in |5.1| We will say that a function p e 
L 2 (50(3),C) is zonal if p = p(0). That is, if the expression of p in the coor- 
dinates {ip, 9, ip) depends only on 9. Zonal functions form a closed subspace of 
p G L 2 (50(3),C). If p is a zonal function then its Fourier series reduces to a 
Legendre series 

p(6) = 5^(25+ l)a s P s (cos9) (22) 
where for 6 > the Legendre coefficient as is given by 

i r 

a s=o p(9)P s (cos9)sin9d9 (23) 

Identities (22 1 and (23 1 can be found as follows. Let p be a zonal function. For 
S 6 Irr(50(3)) let As be the Fourier coefficients of p obtained by replacement 
in ([lj) . The matrix elements of each Ag are noted Af for -6 < a, b < S. For all 
5, a, b as above we have that A^ b is given by -this follows using ([!]) 

1 



8tt 



e lbip p(9)d d ba (cos 9)e xa v sin 9dipd9di\) 



Thus for all 5 e Irr(50(3)) we have that Af ^ only if a = b = 0. In other 
words the matrix As contains at most one nonzero element. This is the diagonal 
element A® = as given by identity ( 23 1 . Identity ( 22 ) follows by constructing 
the Fourier series of p as in ^ . 



5.3 Numerical simulations 

Here we will illustrate the characteristic function method of |4.3| by applying 
it to a numerical example of decompounding on 50(3). Within this example 
we will consider a parametric estimation problem related to a physical inverse 
problem as described in |5.1| Our example is of a compound Poisson process Y 
on 50(3). As in |5.1[ 50(3)-valued random variables are identified with random 
orthogonal matrices. For t > 0, 

N(t) 

Y(t) = n x n 

n=0 
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where the Poisson process N has parameter A = 0.3 and the random variables 
X n have a common probability density p given by expression (19 1. Four values 



will be considered for the parameter g in this expression: 0.85, 0.9, 0.95 and 0.99. 
We will put T — 10. We simulate a number n of i.i.d. observations of Y(T). 
The following values of n are used: 500,5000 and 50000. Note that on average 
the number N(T) of factors involved in the random product Y(T) is equal to 3. 

Before going on, we confirm that the method of |4.3| can be applied for this 
example. In other words, that the X n with the proposed density p are inverse 
invariant. This follows from the development after identities (22 1 and (23 1. 
Indeed, the matrices A$ obtained for p are diagonal with exactly one nonzero 
diagonal element as = g s ■ Since g is real we have that As is Hermitian for all 
5 e Irr(50(3)). Inverse invariance follows by [l] of Proposition [2] 

We will present three sets of figures. Figure [531 is concerned with the com- 



pounding transformation of p. Figure |5.3| illustrates the influence of n on para- 
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(a) Histogram of cos 9 under density p 
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(b) Histogram of cos 9 under distribution of Y(T) 

Figure 1: Compounding transformation of p (histograms) 
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metric and nonparametric estimation errors. Figure 5.3 studies the influence of 
g on the nonparametric estimation error for fixed n. For figures 5.3 and 5.3 we 
have g = 0.9. For figures |5~3| and [5~3| we have n = 50000. We now comment on 
each of these figures. 



Figure [573] illustrates the relation between the distribution of the X n as given 
by the density p and the distribution of Y(T). Both these distributions are 
studied using histograms. The histogram in figure 1(a) is for the cosine of the 
Euler angle 9 £ [0, tt] associated with the random variable X%. The histogram 
in figure 1(b) is for the cosine of 9 associated with Y(T). 

Figure" |5~3| is concerned with the direct compounding transformation rather 
than the inverse decompounding transformation. It is meant to show the his- 
togram in figure [1(b)] as function of the one in 1(a) As expected, the latter 
histogram appears as a wider version of the former. This corresponds to the 
content of Proposition [5] of Section [3] Note also that the dominant value in 
figure 1(b) has moved away from 9 = 0. 



For figure 5.3 the observations made of Y(T) are used to carry out the de- 



compounding approach of 4.3 Parametric and nonparametric estimation errors 
are given graphically for different values of n. Figure 2(a) compares the esti- 
mated Legendre coefficients of p to their theoretical values as = g s for S > 0. 
In figure |2(b)| a priori knowledge of the analytical form of the as is supposed. 
This is used to estimate g. A different parametric estimate is obtained from 
each estimated Legendre coefficient. In figures |2(a) and |2(b)j theoretical values 
are represented by a solid line. 



In figure |2 (a) we have the estimated first I = 31 Legendre coefficients for each 
value of n. Let us note these coefficients a s l for < 6 < I and the corresponding 
value of n. They can be used to evaluate a nonparametric estimate of p as in 
formula (|8| . This is done by replacing them in a truncated Legendre series (22 1. 
We have the nonparametric estimate of p which we note pf 



Pi 



where for all values of n we have that Sq = ao = 1. Depending on n, the random 
nonparametric estimation error from p? is given by 



£(2<5 + l)(a^-as) 2 + £(2<J + l)a 2 s 



8<l 



8>l 



the 



this is the squared L 2 (SO(3),M.) distance between pf and p. In figure 2(a) 
sum over <5 < I appears as a weighted quadratic deviation between estimated 
and theoretical values. 

In figure 2(b) the estimates a^ are used to give naive estimates gg of g based 
on the analytical form of the ag. The error in each of these estimates g^ is 
directly related to the error in the estimate a 1 }. This latter error is shown for 



each S and n in figure 2(a) The influence of n is not important for small values 
of 5. Visually, the aj? in figure 2(a) agree independently of n for < 5 < 5. For 
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(a) Estimated Legendre coefficients aj? from decompounding 
l.o 
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degree (5) 



(b) Corresponding estimates g£ of g (anisotropy parameter) 

Figure 2: Influence of n (□ = 5 * 10 2 ; o = 5 * 10 3 ; A = 5 * 10 4 ) 



n = 50000 the appear to have a regular dependence on S. For n — 5000 and 
n = 500 we have an irregular dependence of the aj? on 5, especially for S > 20. 
Moreover, for S > 25 we have negative values of a,g, clearly inconsistent with 
the form ag — g s . These values do not allow the evaluation of corresponding 
parametric estimates gf. 

Let us remind that g is an important parameter in multiple scattering appli- 
cations. For multiple scattering media with Henyey-Greenstein phase function 
(19), g is the main parameter characterizing the scattering process. Its estima- 



tion from observations as the ones described in |5.1| is equivalent to a physical 
inverse problem. This leads to the physical interpretation of the parametric 



estimation problem represented in figure [2(b) 
For figure |5.3| we have 



n — 50000. For each value of g we simulated n 
observations of Y(T) and calculated estimates of the Legendre coefficients of p as 
for figure 2(a) Estimated and theoretical Legendre coefficients are respectively 
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Figure 3: Influence of g (o = 0.85; □ = 0.9; A = 0.95; V = 0.99) 



represented by empty and filled in symbols. It is clear from this figure that the 
nonparametric estimation error is smaller for larger values of g. Estimation of 
the Legendre coefficients is virtually exact for g — 0.99. 



In order to understand this behavior we note that g in ( 19 1 gives the con- 
centration of p near the value 9 = 0. Indeed, when g = the function p is 
constant and the random variables X n are uniformly distributed on SO(3). In 
the limit g | 1 we have that each random variable X n is almost surely equal to 
the identity matrix. Conditionally on the event {N(T) > 0}, the distribution 
of Y(T) is a mixture of distributions with Henyey-Greenstein density. More 
precisely, for all n > we have the conditional probability density for the Euler 
angle 9 associated with Y(T) 

1 - n 2n 

p(9\N(T) = n) - 



(1 + g 2n - 2g n cos 0)5 

In particular, in the limit g | 1 we have that Y(T) is almost surely equal to the 
identity matrix. Conditionally on {N(T) > 0}, we have in the limit g I that 
Y(T) is uniformly distributed on SO(3). 



Let us note that in our example F(N(T) > 0) ~ 0.96. Figure 5.3 can be 
understood in light of the above discussion. For greater values of <?, observations 
of Y(T) are concentrated near the identity matrix. This leads to fast conver- 
gence of our estimates for the Legendre coefficients of p. For smaller values of 
g, observations of Y(T) are more dispersed and the convergence of estimates is 
slower. In the limit g J, the observations are close to uniformly distributed on 
5*0(3) and our approach breaks down due to numerical problems. 

6 Conclusion 

Nonparametric estimation on compact Lie groups, especially using characteristic 
function methods, is by now a relatively familiar topic in relation to several 
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engineering applications. It has received comprehensive treatment in the case 
where estimation is carried out directly from some stationary process. That 
is, from i.i.d. observations of a group-valued random variable. This paper has 
applied a characteristic function method to the problem of decompounding on 
compact Lie groups. For this problem, nonparamctric estimation is required 
from indirect observations defined in terms of a nonstationary process. 

A first approach of decompounding on compact Lie groups was given. It was 
guided by existing characteristic function methods for the classical problem of 
decompounding. These methods were transposed directly to the setting of har- 
monic analysis on compact Lie groups. Under a suitable symmetry hypothesis, 
treatment of the indirect nature of observations was simplified. The ensuing 
nonparametric estimation error was characterized as depending on the whole 
spectrum of the target density rather than just its smoothness class. In some 
aspects, our approach of decompounding on compact Lie groups might appear 
summary. We hope however that is will attract attention to various problems 
of the statistics of nonstationary stochastic processes on groups. 

This paper also discussed the importance of decompounding on SO (3) to the 
physical inverse problem of multiple scattering. Under a probabilistic interpre- 
tation of the theory of radiative transfer, models based on compound Poisson 
processes on 50(3) were found consistent with the results of the small angles 
approximation of radiative transfer. The possibility of reformulating physical in- 
verse problems of multiple scattering as parametric or nonparamctric statistical 
estimation problems was discussed. The statistical nature of this new point of 
view seems desirable given the high complexity of multiple scattering situations. 
In practice, it might require considerably more elaborate measurements. 
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